ABSTRACT

Hosts and pathogens are locked in an
evolutionary arms race. To infect mice, mouse
hepatitis coronavirus (MHV) has evolved to
recognize mouse CEACAM1a
(mCEACAM1a) as its receptor. To elude
MHV infections, mice may have evolved a
variant allele of the Ceacam1a gene, called
Ceacam1b, producing mCEACAM1b, which is a
much poorer MHV receptor than
mCEACAM1a. Previous studies showed that
sequence differences between mCEACAM1a
and mCEACAM1b in a critical MHV-binding
CC’ loop partially account for the low receptor
activity of mCEACAM1b, but the detailed
structural and molecular mechanisms for the
differential MHV receptor activities of
mCEACAM1a and mCEACAM1b remained
elusive. Here we have determined the crystal
structure of mCEACAM1b and identified the
structural differences and additional residue
differences between mCEACAM1a and
mCEACAM1b that affect MHV binding and
entry. These differences include
conformational alterations of the CC’ loop as
well as residue variations in other MHV-binding
regions, including β-strands C’ and C”
and loop C’C”. Using pseudovirus entry and
protein-protein binding assays, we show that
substituting the structural and residue features
from mCEACAM1b into mCEACAM1a
reduced the viral receptor activity of
mCEACAM1a, whereas substituting the
reverse changes from mCEACAM1a into
mCEACAM1b increased the viral receptor
activity of mCEACAM1b. These results
elucidate the detailed molecular mechanism
by which mice may have kept pace in the
evolutionary arms race with MHV by
undergoing structural and residue changes in
the MHV receptor, providing insight into this
possible example of pathogen-driven
evolution of a host receptor protein.


According to the Red Queen 
hypothesis, hosts and pathogens are in an 
evolutionary arms race to keep pace with each 
other for fitness and survival (1,2). 
Coronaviruses are a large family of ancient 
and diverse RNA virus pathogens that infect 
many mammalian and avian species (3,4). 
Different coronaviruses use a variety of cell 
surface receptors for entry into host cells 
through the activities of virus-surface spike 
proteins (5,6). The host receptor-adapting 
evolution of coronavirus spike proteins has 
been extensively studied (3-6), but 
coronavirus-driven evolution of host receptors 
is much less well understood. The current 
study investigates how a host receptor may 
undergo molecular changes under possible 
selective pressure from lethal coronavirus 
infections and how these changes may help the 
host to resist death from coronavirus 
infections. 

As the prototypic member of the 
coronavirus family, mouse hepatitis 
coronavirus (MHV) presents a good model 
system for studying the co-evolutionary 
relationship between viruses and hosts. 
Depending on the strain, MHV can cause 
enteric, respiratory, or brain infections in 
mice. The enterotropic strains of MHV spread 
widely in susceptible mouse populations and 
are lethal in infant mice of many inbred strains 
(up to 100% fatality) (7). Infection with MHV 
is a major concern in laboratory mice because 
it can disrupt mouse-based research through 
clinical disease and/or alteration of 
immunologic responses (8). MHV uses a
cell-surface protein, mouse Carcinoembryonic
Antigen-related Cell Adhesion Molecule 1a
(mCEACAM1a), as its host receptor (9,10).
The CEACAM1 protein is widely expressed in
all mammals on the membranes of epithelial
cells, endothelial cells, and leukocytes (11). It
mediates cell-cell adhesion and signaling, and
participates in the differentiation and
arrangement of tissue three-dimensional
structure, angiogenesis, apoptosis, tumor
suppression, cancer metastasis, and the
modulation of innate and adaptive immune
responses (12). The envelope-anchored MHV
spike glycoprotein specifically recognizes
mCEACAM1a through the N-terminal domain
of its S1 subunit (S1-NTD) (9,13,14). Our
previous structural studies revealed that
coronavirus S1-NTDs have the same tertiary
structural fold as human galectins
(galactose-binding lectins), and that whereas
MHV S1-NTD recognizes mCEACAM1, bovine
coronavirus (BCoV) S1-NTD recognizes
sugar (15,16). Thus, we proposed that
coronaviruses acquired a host galectin gene
and inserted it into their spike protein gene, and
that whereas BCoV S1-NTD has kept its
original sugar-binding lectin activity, MHV
S1-NTD has evolved novel mCEACAM1a-binding
affinity and lost its original
sugar-binding lectin activity. These studies have
provided insight into the host
receptor-adapting evolution of coronaviruses (5,6).

To respond to the selective pressure
from lethal MHV infections, mice may have
evolved a variant allele of the Ceacam1a
gene, called Ceacam1b; of the two gene
products, mCEACAM1b is a much less
efficient MHV receptor than mCEACAM1a
(17-19). Correspondingly, mice homozygous
for Ceacam1b (1b/1b) are resistant to death
from MHV infections, while mice
homozygous for Ceacam1a (1a/1a) are highly
susceptible to lethal MHV infections
(7,9,10,20,21). Other than their different MHV
receptor activities, mCEACAM1a and
mCEACAM1b appear to be functionally
equivalent: neither 1a/1a mice nor 1b/1b mice
show any growth defects, whereas a
dysfunctional Ceacam1 gene leads to impaired
insulin clearance, abnormal weight gain, and
reduced fertility (22). Our previous structural
studies of mCEACAM1a and its complex with
MHV S1-NTD have delineated detailed
interactions between mCEACAM1a and MHV
S1-NTD (15,23). Moreover, previous studies
identified the CC’ loop (the loop that connects
β-strands C and C’) in mCEACAM1a as critical
for MHV binding; the sequence of this loop
diverges in mCEACAM1b, partially
accounting for the low MHV receptor activity
of mCEACAM1b (17,24). However, due to
the lack of structural information about
mCEACAM1b, it was not known which
structural differences, or whether additional
residue differences, between mCEACAM1a
and mCEACAM1b account for the MHV
resistance of mice homozygous for Ceacam1b.

In this study, we have determined the
crystal structure of mCEACAM1b and
elucidated the structural differences and
additional residue differences between
mCEACAM1a and mCEACAM1b that
impede the binding of MHV S1-NTD to
mCEACAM1b. Moreover, we have performed
structure-guided mutagenesis studies on
mCEACAM1a and mCEACAM1b to
investigate the significance of their structural
and sequence differences for their MHV
receptor activities. These results provide
insight into the possibility that MHV has
driven the evolution of the mCEACAM1
protein in mice.

RESULTS 

Because of alternative mRNA
splicing, mCEACAM1 contains either two
[D1 and D4] or four [D1-D4] Ig-like domains
in tandem, in addition to a transmembrane
anchor and a short intracellular tail at its
C-terminus (12). mCEACAM1b[D1,D4]
(residues 1-202) without the membrane anchor
or the intracellular tail was expressed and
purified as previously described for
mCEACAM1a[D1,D4] (15). It was
subsequently crystallized in space group
P3₁21, with a = 113.1 Å, b = 113.1 Å, and c = 64.4 Å.
Although each asymmetric unit of the crystal
contains two mCEACAM1b[D1,D4]
molecules, the protein is a monomer in
solution based on gel filtration
chromatography. The structure was
determined by molecular replacement using
the structure of mCEACAM1a[D1,D4] as the
search template, and refined at 3.1 Å resolution
(Fig. 1A; Table 1). The final model contains
all of the residues in domains D1 and D4, and
a glycan N-linked to Asn270.

The overall structure of
mCEACAM1b[D1,D4] is similar to that of
mCEACAM1a[D1,D4], but the structural
similarity is uneven in different regions of the
two proteins. In both mCEACAM1a and
mCEACAM1b, the two Ig-like domains, D1
and D4, are arranged in tandem without any
significant interactions with each other (Fig.
1A, 1B). In mCEACAM1a, the D1 domain
binds to MHV S1-NTD, whereas the D4
domain has no contact with MHV S1-NTD
(Fig. 1C). In the D1 domain of mCEACAM1a,
several loops (CC’, C’C”, C”D, and FG) and
β-strands (βC, βC’, and βC”) are directly
involved in MHV binding, and thus these
regions have been called the virus-binding
motifs (VBMs) (Fig. 1C, 1D). Interestingly,
the D1 domains of mCEACAM1a and
mCEACAM1b are significantly more
divergent in both primary structure (sequence
identity = 74%) and tertiary structure (main
chain RMSD = 1.11 Å) than the D4 domains
(sequence identity = 98%; main chain RMSD
= 0.77 Å) (Fig. 2). Furthermore, within the D1
domain, the VBMs of mCEACAM1a and
mCEACAM1b are more divergent in both
primary structure (sequence identity = 56%)
and tertiary structure (main chain RMSD =
1.36 Å) than the non-virus-binding regions are
(Fig. 2C, 2D). These results suggest that,
compared with the rest of the protein, the
VBMs in the D1 domain of mCEACAM1a
have been under strong selective pressure,
possibly from MHV infections.
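
The region-by-region comparison above rests on two simple quantities, percent sequence identity and main-chain RMSD. As a minimal illustrative sketch (not the procedure used in this study, where RMSDs were calculated with Coot), the Python functions below show how each quantity is defined. The function names and the toy inputs are hypothetical, the sequences are assumed to be already aligned with "-" marking gaps, and the RMSD function assumes that the two coordinate sets have already been superposed and paired atom by atom.

```python
import math


def sequence_identity(aligned_a, aligned_b):
    """Percent identity over the non-gap columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue (no gap).
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired, pre-superposed atoms.

    Each argument is a list of (x, y, z) tuples; no superposition is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy inputs only; these are not residues or coordinates from either structure.
print(sequence_identity("ACDE", "ACDF"))                      # 75.0
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))   # 1.0
```

Restricting either function to the residues of one region (for example, the VBMs versus the rest of domain D1) gives a per-region value of the kind summarized in Fig. 2D.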

Further inspection of the structures of
mCEACAM1a and mCEACAM1b has
identified detailed structural divergence
between the VBMs of the two proteins. In
mCEACAM1a, a critical CC’ loop (the loop that
connects β-strands C and C’) in the D1
domain interacts extensively with MHV
S1-NTD and thus plays a prominent role in the
virus/receptor binding interactions (Fig. 3A).
These interactions include multiple
hydrophobic interactions between the side
chain of receptor Ile41 and the side chains of
MHV Tyr15, Leu89, and Leu160, as well as
the hydrogen bonds between the carbonyl
oxygen of receptor Thr39 and the side chain of
MHV Arg20. Compared with mCEACAM1a,
mCEACAM1b has undergone significant
structural changes in loop CC’ (Fig. 3B, 3C),
which result from residue changes in this loop.
For example, residue 38 is a threonine in
mCEACAM1a but a proline in
mCEACAM1b. This residue change likely has
a significant impact on the conformation of
loop CC’ because prolines are known to cause
changes to protein secondary structures.
Additionally, a number of residues in other
VBM regions of mCEACAM1a also form
critical interactions with MHV S1-NTD, but
have been substituted with different residues
in mCEACAM1b. For example, in
mCEACAM1a, the side chain of receptor
Arg47 in strand βC’ forms hydrogen bonds
with the carbonyl oxygens of MHV Gln23 and
Val25, the side chains of receptor Met54 and
Phe56 in strand βC” are part of a hydrophobic
cluster at the S1-NTD/receptor interface, and
the side chain of receptor Asn59 in strand βC”
forms hydrophobic stacking with the Cα of
S1-NTD Gly29 (Fig. 4A, 4B). However,
Arg47, Met54, Phe56, and Asn59 in
mCEACAM1a have been substituted with
His47, Lys54, Thr56, and Pro59, respectively,
in mCEACAM1b (Fig. 4C, 4D). The above
structural and residue changes in the VBMs
from mCEACAM1a to mCEACAM1b would
lead to the loss of numerous energetically
favorable interactions at the S1-NTD/receptor
interface and disrupt the virus/receptor binding
interactions. These structural analyses further
suggest that the VBMs in the D1 domain of
mCEACAM1 have been under strong
selective pressure, possibly from MHV
infections.

To investigate how the structural and
residue differences between mCEACAM1a
and mCEACAM1b affect their functions as
MHV receptors, we carried out
structure-guided mutagenesis and introduced structural
and residue features from mCEACAM1b into
mCEACAM1a. These structural and residue
changes include replacing loop CC’ in
mCEACAM1a with its counterpart from
mCEACAM1b, and substituting residues 47,
54, 56, and 59 in mCEACAM1a with the
corresponding residues from mCEACAM1b.
A pseudovirus entry assay was performed
in which a lentiviral vector pseudotyped with the
MHV spike protein was used to enter
mammalian cells expressing either wild-type
or mutant mCEACAM1a on their surface. The
results demonstrated that each of the structural
and residue features from mCEACAM1b
introduced into mCEACAM1a significantly
reduced the efficiency of pseudovirus entry
(Fig. 5A), reflecting a weaker binding affinity
between the MHV spike protein and the
mutant mCEACAM1a. Thus, the structural
and residue changes from mCEACAM1a to
mCEACAM1b reduced the capability of
mCEACAM1a to serve as the MHV receptor.
These loss-of-function experiments mimic the
possible loss-of-function evolution of the
mouse Ceacam1a gene under the selective
pressure from MHV infections.

To further explore the functional
significance of the structural and residue
differences between mCEACAM1a and
mCEACAM1b, we introduced the reverse
substitutions (i.e., the features from
mCEACAM1a introduced into
mCEACAM1b). These structural and residue
changes include replacing (i) loop CC’, (ii)
both loop CC’ and strand βC’, or (iii) all of
loop CC’, strand βC’, loop C’C”, and strand
βC” in mCEACAM1b with the corresponding
regions from mCEACAM1a. In addition to the
pseudovirus entry assay, protein-protein
binding assays were also performed between
the MHV S1-NTD and wild-type or mutant
mCEACAM1b. The results showed that all of
the structural and residue changes introduced
into mCEACAM1b significantly enhanced
both the pseudovirus entry efficiency and the
protein-protein binding affinity (Fig. 5A, 5B).
Among the mutant mCEACAM1b molecules,
the one containing changes in loop CC’, strand
βC’, loop C’C”, and strand βC” all together
demonstrated the highest MHV receptor
activity. More specifically, introduction of the
above structural and residue changes into
mCEACAM1b restored the receptor activity
of mCEACAM1b up to ~67% of that of
mCEACAM1a based on the pseudovirus entry
efficiency and ~83% of that of mCEACAM1a based
on the protein-protein binding affinity. It is
worth noting that incorporation of the above
structural and residue features from
mCEACAM1a did not fully restore the MHV
receptor activity of mCEACAM1b to the same
level as mCEACAM1a, suggesting that
structural and/or residue differences elsewhere
in domain D1 may account for the remaining
difference between mCEACAM1a and
mCEACAM1b in their MHV receptor
activities. Nevertheless, these gain-of-function
experiments represent the reverse course of
the possible loss-of-function evolution of the
mouse Ceacam1 gene under the selective
pressure from MHV infections.

DISCUSSION 

The Red Queen hypothesis states that 
hosts and pathogens are constantly in an 
evolutionary arms race. Previous structural 
studies of the coronavirus/receptor interactions 
have revealed how coronaviruses have 
evolved a variety of strategies to recognize 
different host receptors for host range 
expansion and cross-species infections 
(5,6,15,16,23,25-31). One of these strategies
would be for coronaviruses to steal a host
galectin, which became the S1-NTD of the
coronavirus spike protein, and use it to bind
sugar on host cell surfaces for viral attachment
to host cells. While the S1-NTDs of many
contemporary coronaviruses still recognize
sugar receptors (32-38), MHV S1-NTD has
evolved novel binding affinity for the
mCEACAM1a protein, which greatly
enhanced the infection efficiency of MHV in
mouse cells (15,16) (Fig. 6). Through the
above evolution of its spike protein, MHV 
appears to have gained a significant edge in 
the evolutionary arms race with mice and 
become a highly infectious and pathogenic 
virus for mice. 

How have mice evolved to keep pace 
in the evolutionary arms race with MHV? An 
interesting observation is that the mouse
Ceacam1 gene has diversified into two alleles,
Ceacam1a and Ceacam1b. Their protein
products, mCEACAM1a and mCEACAM1b,
demonstrate different MHV receptor
activities: mCEACAM1b is a much poorer
MHV receptor than mCEACAM1a.
Consequently, mice homozygous for
Ceacam1b are highly resistant to death from
MHV infections. The selective pressures that
drive the evolution of mammalian Ceacam1
genes could come from several sources.
Mammalian CEACAM1 functions in many
physiological processes including cell-cell
adhesion, cell signaling, and cell development
(12). In addition, human CEACAM1 is a
receptor for a variety of bacterial pathogens
(39-41). Thus, the physiological functions of
CEACAM1, host evasion of bacterial
infections, or some other unknown selective
pressures could potentially drive the evolution
of mammalian Ceacam1 genes. Although the
physiological functions of mCEACAM1b
need to be investigated further, mice
homozygous for Ceacam1b apparently retain
all the normal phenotypes of mice homozygous
for Ceacam1a, suggesting no major alterations of the
physiological functions of mCEACAM1b.
Moreover, mouse CEACAM1a is not a
receptor for those bacterial pathogens that use
human CEACAM1 as the receptor (42). On
the other hand, MHV infections can be
devastating to infant mice that express
mCEACAM1a. Therefore, while other
selective pressures cannot be ruled out, MHV
infection is likely one of the major driving
forces for the evolution of the mouse Ceacam1
gene. There are insufficient data on mouse
genomes to prove which one of the mouse
Ceacam1 alleles evolved first. Based on the
above discussion, we suggest that in the
mouse population the Ceacam1a allele
preceded the appearance and maintenance of
the Ceacam1b allele in the presence of MHV
epidemics.

This study investigates the structural
and residue differences between
mCEACAM1a and mCEACAM1b that render
mCEACAM1b a less efficient MHV receptor.
Previous studies identified a critical
MHV-binding loop CC’ that diverges in sequence
between mCEACAM1a and mCEACAM1b,
partially accounting for the different MHV
receptor activities of the two proteins (17,24).
The current study reveals the altered
conformation of loop CC’ in the crystal
structure of mCEACAM1b, providing a
structural basis for the critical role of loop CC’
in the different MHV receptor activities of the
two mCEACAM1 molecules. Furthermore,
this study identifies residue variations in
several other MHV-binding regions in
mCEACAM1b that render mCEACAM1b a
poor MHV receptor. These regions include
β-strands C’ and C” and loop C’C”. Using
structure-guided mutational and functional
assays, this study shows that the structural and
residue substitutions from mCEACAM1b into
mCEACAM1a cause a loss of receptor
function in mCEACAM1a, whereas the
reverse substitutions cause a gain of receptor
function in mCEACAM1b. The structural and
residue changes from mCEACAM1a to
mCEACAM1b mimic the possible
loss-of-function evolution in mCEACAM1a during
pathogen-driven host evolution, which would
result in less severe MHV infections in mice
and partial alleviation of the selective pressure
from MHV infections (Fig. 6). Therefore, it is
likely that through divergent evolution of their
Ceacam1 gene to generate the Ceacam1b
allele, mice have gained the ability to
keep pace in the evolutionary arms race with
MHV for fitness and survival. Overall, the
current study provides insight into a possible
example of coronavirus-driven evolution of a
mouse receptor protein, lending molecular
evidence to the Red Queen hypothesis.

MATERIALS AND METHODS 

Protein preparation and
crystallization- mCEACAM1b[D1,D4]
(residues 1-202) was expressed and purified as
previously described for
mCEACAM1a[D1,D4] (15). Briefly,
mCEACAM1b[D1,D4]
containing a C-terminal His6 tag was
expressed in Sf9 insect cells using the
Bac-to-Bac expression system (Life Technologies),
and was secreted into cell culture medium.
The protein was harvested and loaded onto a
nickel-nitrilotriacetic acid (Ni-NTA) column,
eluted from the Ni-NTA column with
imidazole, and further purified by gel filtration
chromatography on Superdex 200 (GE
Healthcare). The protein was concentrated to
10 mg/ml and stored in buffer containing 20
mM Tris pH 7.2 and 200 mM NaCl.
Crystallization of mCEACAM1b[D1,D4] was
set up using the sitting drop vapor diffusion
method, with 1 µl protein solution added to 1
µl well buffer containing 0.1 M Tris pH 6.2,
10% PEG4000 (v/v), and 1 M NaCl at 20°C.
Crystals of mCEACAM1b[D1,D4] appeared
in 2-3 days and were allowed to grow for
another 2 weeks before they were harvested
and flash-frozen in liquid nitrogen.


Data collection and structure
determination- X-ray diffraction data were
collected at the Advanced Light Source
beamline 4.2.2 and processed using HKL2000
(43). The structure of mCEACAM1b[D1,D4]
was determined by molecular replacement
using mCEACAM1a[D1,D4] (PDB 3R4D) as
the search template. The model was built
using Coot (44) and refined with Refmac (45)
to a final Rwork and Rfree of 0.216 and 0.273,
respectively.
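
For readers who wish to sanity-check the crystal parameters, the unit cell volume follows directly from the cell dimensions and angles reported in Table 1. The short Python sketch below applies the general triclinic volume formula; it is an illustration only, not part of the data-processing pipeline used here, and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit cell volume (cubic angstroms) from cell edges and angles (degrees)."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    # General formula, valid for any crystal system.
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell dimensions and angles as listed in Table 1 (space group P3(1)21).
volume = unit_cell_volume(113.097, 113.097, 64.380, 90, 90, 120)
print(f"Unit cell volume: {volume:.0f} cubic angstroms")
```

For this trigonal cell the formula reduces to a × a × c × sin(120°), which gives a volume of roughly 7.1 × 10^5 Å^3.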

Pseudovirus entry assay- Lentiviruses
pseudotyped with MHV spike protein were
produced as previously described (15).
Briefly, pcDNA3.1(+) plasmid encoding
MHV spike protein (from MHV strain A59)
was co-transfected into HEK293T cells with
helper plasmid psPAX2 and reporter plasmid
pLenti-GFP at a molar ratio of 1:1:1 using
Lipofectamine 2000 (Life Technologies). At 48
hours post-transfection, the produced
pseudovirus particles were harvested and
inoculated onto HEK293T cells expressing
mCEACAM1a or mCEACAM1b (wild type or
mutant). At 48 hours post-infection, cells were
observed under a fluorescence microscope, and the
percentage of GFP-expressing cells was
calculated using ImageJ (National Institutes of
Health). The expression levels of
mCEACAM1a and mCEACAM1b in
HEK293T cells were measured by Western
blotting using antibodies against their
C-terminal C9 tag, quantified using ImageJ, and
presented as relative expression in
comparison to wild-type mCEACAM1a. The
relative expression of each receptor was used
to normalize pseudovirus entry efficiency. The
experiments were further repeated twice, and
similar results were obtained.
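
The normalization step described above can be sketched as follows. This is a hedged illustration rather than the exact calculation used in this study: it assumes that normalization means dividing the percentage of GFP-expressing cells by the receptor's relative expression and then expressing the result as a percentage of the wild-type mCEACAM1a value (taken as 100%, as in Fig. 5A). The function names and the example numbers are hypothetical.

```python
import math
import statistics


def normalized_entry(percent_gfp, relative_expression, wt_percent_gfp):
    """Entry efficiency as % of wild type, corrected for receptor expression.

    relative_expression is the receptor level relative to wild-type
    mCEACAM1a (wild type = 1.0 by definition).
    """
    if relative_expression <= 0 or wt_percent_gfp <= 0:
        raise ValueError("expression and wild-type signal must be positive")
    return 100.0 * (percent_gfp / relative_expression) / wt_percent_gfp


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


# Made-up replicate values for one hypothetical mutant (not data from this study).
replicates = [normalized_entry(p, 0.8, 40.0) for p in (10.0, 12.0, 11.0)]
mean, sem = mean_and_sem(replicates)
print(f"entry = {mean:.1f}% of wild type, SEM = {sem:.1f}")
```

With three replicates per receptor, the mean and standard error computed this way correspond to the bar heights and error bars in an entry-efficiency plot.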

Protein-protein interaction assay
using AlphaScreen- The interactions between
recombinant MHV S1-NTD and recombinant
mCEACAM1a or mCEACAM1b (wild type or
mutant) were measured using AlphaScreen as
previously described (46,47). Briefly, 300 nM
MHV S1-NTD with a C-terminal His6 tag
was mixed with 30 nM mCEACAM1a or
mCEACAM1b (wild type or mutant) with a
C-terminal human IgG4 Fc tag in OptiPlate-96
(PerkinElmer) for 1 hour at room temperature.
AlphaScreen Nickel Chelate Donor Beads and
AlphaScreen protein A acceptor beads
(PerkinElmer) were added to the mixture at
final concentrations of 20 µg/ml. The mixture
was incubated at room temperature for 1 hour,
protected from light. The assay plates were
read in an EnSpire plate reader (PerkinElmer).
The experiments were further repeated twice,
and similar results were obtained.
Table 1. Data collection and refinement statistics



                              mCEACAM1b
Data collection
  Space group                 P3₁21
  Cell dimensions
    a, b, c (Å)               113.097, 113.097, 64.380
    α, β, γ (°)               90, 90, 120
  Resolution (Å)              16.97-3.10 (3.21-3.10) a
  Rsym                        0.114 (0.507)
  I/σI                        9.15 (1.78)
  Completeness (%)            96.3 (86.4)
  Redundancy                  2.5 (2.1)
Refinement
  Resolution (Å)              16.97-3.10
  No. reflections             8557
  Rwork/Rfree                 0.216/0.273
  No. atoms                   1579
    Protein                   1541
    Ligand                    28
    Water                     10
  B-factors (Å²)              40.49
    Protein                   40.57
    Ligand                    41.59
    Water                     23.83
  RMSD
    Bond lengths (Å)          0.015
    Bond angles (°)           1.46
  Ramachandran plot
    Favored (%)               93.4
    Outliers (%)              0.51

a Values in parentheses are for the highest-resolution shell.
FIGURE LEGENDS

FIGURE 1. Overall structure of mouse CEACAM1b. (A) Crystal structure of
mCEACAM1b[D1,D4] containing the D1 and D4 domains. Secondary structures in domain D1
of mCEACAM1b[D1,D4] are named following the mCEACAM1a[D1,D4] structure (23). (B)
Crystal structure of mCEACAM1a[D1,D4] (PDB 1L6Z) (23). Virus-binding motifs (VBMs) are
in blue. (C) Crystal structure of mCEACAM1a[D1,D4] complexed with MHV S1-NTD (PDB
3R4D) (16). MHV S1-NTD is in cyan, with RBMs in red. (D) Sequence alignment of
mCEACAM1b[D1,D4] and mCEACAM1a[D1,D4]. β-strands are shown as arrows. VBMs in
mCEACAM1a and the corresponding regions in mCEACAM1b are in blue. Asterisks indicate
positions that have fully conserved residues; colons indicate positions that have strongly
conserved residues; periods indicate positions that have weakly conserved residues. The boundary
between domains D1 and D4 is indicated by a black line. Sequence alignment was done using
CLUSTAL W (48). Structural illustrations were done using PyMol (49).

FIGURE 2. Structural comparisons of mCEACAM1a and mCEACAM1b. (A) Overlay of
mCEACAM1a and mCEACAM1b in domain D1. mCEACAM1a is in black, and mCEACAM1b
in yellow. VBMs in mCEACAM1a and the corresponding regions in mCEACAM1b are in blue.
(B) Overlay of mCEACAM1a and mCEACAM1b in domain D4. (C) Structure of mCEACAM1b
showing distribution of the residues that differ between mCEACAM1a and mCEACAM1b.
Regions corresponding to VBMs in mCEACAM1a are in blue. Residues that differ between
mCEACAM1a and mCEACAM1b are shown as balls and sticks. (D) Sequence identities and
RMSDs between mCEACAM1a and mCEACAM1b in different regions. See Figure 1D for the
residue ranges of domains and VBMs. RMSDs were calculated using Coot (50).

FIGURE 3. Structural comparisons of mCEACAM1a and mCEACAM1b in MHV-binding
loop CC’. (A) Interactions between MHV S1-NTD (from MHV strain A59) and loop CC’ of
mCEACAM1a. mCEACAM1 residues are in green, and S1-NTD residues in magenta.
Hydrophobic interactions are indicated as arrows, and hydrogen bonds as dotted lines. (B)
Structure of loop CC’ of mCEACAM1b. (C) Overlay of the CC’ loops from mCEACAM1a and
mCEACAM1b.

FIGURE 4. Structural comparisons of mCEACAM1a and mCEACAM1b in MHV-binding
strands βC’ and βC”. (A) Interactions between MHV S1-NTD and Arg47 in strand βC’ of
mCEACAM1a. (B) Interactions between MHV S1-NTD and Met54, Phe56, and Asn59 in strand
βC” of mCEACAM1a. (C) Conformation of the side chain of His47 in strand βC’ of
mCEACAM1b. (D) Conformations of the side chains of Lys54, Thr56, and Pro59 in strand βC”
of mCEACAM1b.

FIGURE 5. Structure-guided mutational and functional characterizations of mCEACAM1a
and mCEACAM1b. (A) Pseudovirus entry efficiency. Lentiviruses pseudotyped with MHV
spike protein (from MHV strain A59) were used to enter HEK293T cells expressing
mCEACAM1a or mCEACAM1b (wild type or mutant). The relative expression of each receptor
was used to normalize pseudovirus entry efficiency. The pseudovirus entry mediated by wild-type
mCEACAM1a was taken as 100%. Error bars indicate standard errors of the means (n = 3).
Comparisons between wild-type mCEACAM1a and mutant mCEACAM1a, or between wild-type
mCEACAM1b and mutant mCEACAM1b in their capabilities to mediate pseudovirus entry were
done using two-tailed t-test (**, P<0.01; ***, P<0.001). The mutant mCEACAM1a molecules
contain single mutations R47H, M54K, F56T, Q59P, or loop CC’ from mCEACAM1b (residues
38-43). The mutant mCEACAM1b molecules contain loop CC’ from mCEACAM1a (residues
38-43), loop CC’ and strand βC’ from mCEACAM1a (residues 38-51), or loop CC’, strand βC’,
loop C’C”, and strand βC” from mCEACAM1a (residues 38-59). (B) Protein-protein binding
assay. The interactions between MHV S1-NTD and mCEACAM1a or mCEACAM1b (wild type
or mutant) were measured using AlphaScreen assay. MHV S1-NTD with a C-terminal His6 tag
and mCEACAM1a or mCEACAM1b (wild type or mutant) with a C-terminal human IgG4 Fc
tag were attached to AlphaScreen Nickel Chelate Donor Beads and AlphaScreen protein A
acceptor beads, respectively. Error bars indicate standard errors of the means (n = 3).
Comparisons between wild-type mCEACAM1b and mutant mCEACAM1b in their binding
affinity for MHV S1-NTD were done using two-tailed t-test (***, P<0.001).

FIGURE 6. Proposed evolutionary arms race between MHV and mice. The race includes 
both MHV-driven evolution of mice (top) and mouse-adapting evolution of MHV (bottom). See 
text for detailed discussion. 